Sequence-Based Protein Crystallization Propensity Prediction for Structural Genomics: Review and Comparative Analysis

Authors

  • Lukasz Kurgan
  • Marcin J. Mizianty
Abstract

Structural genomics (SG) is an international effort that aims at solving the three-dimensional shapes of important biological macromolecules, with a primary focus on proteins. One of the main bottlenecks in SG is the ability to produce diffraction-quality crystals for X-ray crystallography-based protein structure determination. SG pipelines allow for a certain flexibility in target selection, which motivates the development of in silico methods for sequence-based prediction/assessment of protein crystallization propensity. We overview existing SG databanks that are used to derive these predictive models, and we discuss analytical results concerning protein sequence properties that were discovered to correlate with the ability to form crystals. We also contrast and empirically compare modern sequence-based predictors of crystallization propensity, including OB-Score, ParCrys, XtalPred and CRYSTALP2. Our analysis shows that these methods provide useful and complementary predictions. Although their average accuracy is similar, at around 70%, we show that applying a simple majority-vote-based ensemble improves accuracy to almost 74%. The best improvements are achieved by combining XtalPred with CRYSTALP2, while the OB-Score and ParCrys methods overlap to a larger extent, although they still complement the other two predictors. We also demonstrate that 90% of the protein chains can be correctly predicted by at least one of these methods, which suggests that more accurate ensembles could be built in the future. We believe that current protein crystallization propensity predictors could provide useful input for the target selection procedures utilized by the SG centers.
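The majority-vote ensemble and the "correctly predicted by at least one method" figure mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, each predictor is assumed to output a binary crystallizable / non-crystallizable call per chain, and the tie-breaking rule (needed because four voters can split 2-2) is an assumption, since the abstract does not say how ties are resolved.

```python
from typing import Dict, List


def majority_vote(votes: List[bool], tie_breaker: bool = False) -> bool:
    """Return the majority call among binary votes.

    tie_breaker is an assumed policy for even splits (e.g. 2-2 with
    four predictors); the source abstract does not specify one.
    """
    positives = sum(votes)
    negatives = len(votes) - positives
    if positives == negatives:
        return tie_breaker
    return positives > negatives


def ensemble_accuracy(
    predictions: Dict[str, List[bool]],
    labels: List[bool],
    tie_breaker: bool = False,
) -> float:
    """Fraction of chains for which the majority vote matches the label."""
    correct = 0
    for i, label in enumerate(labels):
        votes = [calls[i] for calls in predictions.values()]
        if majority_vote(votes, tie_breaker) == label:
            correct += 1
    return correct / len(labels)


def oracle_coverage(
    predictions: Dict[str, List[bool]],
    labels: List[bool],
) -> float:
    """Fraction of chains that at least one predictor gets right.

    This is an upper bound on what any ensemble that picks one of the
    base predictions per chain could achieve.
    """
    hits = 0
    for i, label in enumerate(labels):
        if any(calls[i] == label for calls in predictions.values()):
            hits += 1
    return hits / len(labels)
```

To use it, pass a dictionary that maps each predictor name (for example "XtalPred" or "CRYSTALP2") to its list of per-chain binary calls, together with the list of experimentally observed outcomes in the same chain order. The gap between `ensemble_accuracy` and `oracle_coverage` indicates how much room a better combination scheme would have.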


Similar resources

CRYSpred: accurate sequence-based protein crystallization propensity prediction using sequence-derived structural characteristics.

Relatively low success rates of X-ray crystallography, which is the most popular method for solving protein structures, motivate the development of novel methods that support the selection of tractable protein targets. This aspect is particularly important in the context of the current structural genomics efforts, which allow for a certain degree of flexibility in target selection. We propose CRYSpr...


PredPPCrys: Accurate Prediction of Sequence Cloning, Protein Production, Purification and Crystallization Propensity from Protein Sequences Using Multi-Step Heterogeneous Feature Fusion and Selection

X-ray crystallography is the primary approach to solve the three-dimensional structure of a protein. However, a major bottleneck of this method is the failure of multi-step experimental procedures to yield diffraction-quality crystals, including sequence cloning, protein material production, purification, crystallization and ultimately, structural determination. Accordingly, prediction of the p...


News & Views Layout

Nature Structural Biology • volume 5 number 12 • December 1998, p. 1029. Recently, some 200 structural biologists from both academia and industry gathered in Avalon, New Jersey to explore the science and organization of the new field of structural genomics. The aim of the structural genomics project is to deliver structural information about most proteins. While it is not feasible to determine the ...


Structure of Lmaj006129AAA, a hypothetical protein from Leishmania major.

The gene product of structural genomics target Lmaj006129 from Leishmania major codes for a 164-residue protein of unknown function. When SeMet expression of the full-length gene product failed, several truncation variants were created with the aid of Ginzu, a domain-prediction method. 11 truncations were selected for expression, purification and crystallization based upon secondary-structure e...


Crysalis: an integrated server for computational analysis and design of protein crystallization.

The failure of multi-step experimental procedures to yield diffraction-quality crystals is a major bottleneck in protein structure determination. Accordingly, several bioinformatics methods have been successfully developed and employed to select crystallizable proteins. Unfortunately, the majority of existing in silico methods only allow the prediction of crystallization propensity, seldom enab...




Publication date: 2014